feat: cache versioned kubelet kubectl package binaries#8287
Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes Linux node provisioning by reusing kubelet/kubectl binaries already cached on the VHD (materialized as versioned files under /opt/bin) instead of reinstalling via the package manager during CSE.
Changes:
- VHD build: extract kubelet/kubectl binaries from cached
.deb/.rpmartifacts into/opt/bin/<tool>-<k8sVersion>. - CSE: add shared helpers to detect/move cached versioned kube binaries; update Ubuntu and Mariner package-based kubelet/kubectl install paths to prefer cache unless
SHOULD_ENFORCE_KUBE_PMC_INSTALL=true. - Tests: extend VHD content validation and add/extend ShellSpec coverage for the cache-first behavior.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
vhdbuilder/packer/install-dependencies.sh |
Adds VHD-build extraction of kubelet/kubectl binaries from cached package artifacts into versioned /opt/bin paths. |
vhdbuilder/packer/test/linux-vhd-content-test.sh |
Extends VHD content tests to validate versioned package-backed kubelet/kubectl binaries exist and match expected versions. |
parts/linux/cloud-init/artifacts/cse_install.sh |
Adds shared helpers to detect and move cached versioned kubelet/kubectl binaries; reuses them in the URL-based install flow. |
parts/linux/cloud-init/artifacts/ubuntu/cse_install_ubuntu.sh |
Updates Ubuntu package-based kubelet/kubectl install path to prefer cached versioned binaries. |
parts/linux/cloud-init/artifacts/mariner/cse_install_mariner.sh |
Updates Mariner/AzureLinux package-based kubelet/kubectl install path to prefer cached versioned binaries. |
spec/parts/linux/cloud-init/artifacts/cse_install_ubuntu_spec.sh |
Adds ShellSpec coverage for Ubuntu cache-first kubelet/kubectl install behavior. |
spec/parts/linux/cloud-init/artifacts/cse_install_mariner_spec.sh |
Adds ShellSpec coverage for Mariner cache-first kubelet/kubectl install behavior. |
98894d6 to
9614c74
Compare
9614c74 to
edbcee6
Compare
edbcee6 to
b1dd3e8
Compare
c3f5577 to
4ed5d32
Compare
| mv "${sourceBinary}" "${targetPath}" | ||
| chown root:root "${targetPath}" | ||
| chmod 0755 "${targetPath}" |
There was a problem hiding this comment.
extractRPMBinaryFromFile assigns the extracted binary to binaryPath, but later tries to move ${sourceBinary}, which is undefined. This will fail at runtime and break kubelet/kubectl (and credential-provider) extraction. Replace ${sourceBinary} with the selected ${binaryPath} (and consider using install -m0755 to avoid cross-filesystem mv edge cases).
| mv "${sourceBinary}" "${targetPath}" | |
| chown root:root "${targetPath}" | |
| chmod 0755 "${targetPath}" | |
| install -m 0755 "${binaryPath}" "${targetPath}" |
| fallbackToKubeBinaryInstall() { | ||
| packageName="${1:-}" | ||
| packageVersion="${2:-}" | ||
| local targetPath="${3:-/opt/bin/${packageName}}" | ||
| if [ "${packageName}" = "kubelet" ] || [ "${packageName}" = "kubectl" ]; then | ||
| if [ "${SHOULD_ENFORCE_KUBE_PMC_INSTALL}" = "true" ]; then | ||
| echo "Kube PMC install is enforced, skipping fallback to kube binary install for ${packageName}" | ||
| return 1 | ||
| elif [ -f "/opt/bin/${packageName}-${packageVersion}" ]; then | ||
| mv "/opt/bin/${packageName}-${packageVersion}" "/opt/bin/${packageName}" | ||
| chmod a+x /opt/bin/${packageName} | ||
| rm -rf /opt/bin/${packageName}-* & | ||
| mv "/opt/bin/${packageName}-${packageVersion}" "${targetPath}" |
There was a problem hiding this comment.
The cache file naming on the VHD is /opt/bin/<tool>-<k8sVersion> (version stripped to X.Y.Z), but this helper looks for /opt/bin/<tool>-<packageVersion> verbatim. For RPMs, packageVersion may include a release suffix (e.g., 1.34.0-5.azl3), causing the cache-first path to be skipped even when /opt/bin/kubelet-1.34.0 exists. Normalize packageVersion to the intended <k8sVersion> (strip epoch and anything after the first -) and/or check both possible filenames.
| echo "installing azure-acr-credential-provider package version: $packageVersion" | ||
| mkdir -p "${CREDENTIAL_PROVIDER_BIN_DIR}" | ||
| chown -R root:root "${CREDENTIAL_PROVIDER_BIN_DIR}" | ||
| installRPMPackageFromFile "azure-acr-credential-provider" "${packageVersion}" || exit $ERR_CREDENTIAL_PROVIDER_DOWNLOAD_TIMEOUT | ||
| ln -snf /usr/bin/azure-acr-credential-provider "$CREDENTIAL_PROVIDER_BIN_DIR/acr-credential-provider" | ||
| installRPMPackageFromFile "azure-acr-credential-provider" "${packageVersion}" "${CREDENTIAL_PROVIDER_BIN_DIR}/acr-credential-provider" || exit "$ERR_CREDENTIAL_PROVIDER_DOWNLOAD_TIMEOUT" |
There was a problem hiding this comment.
This switches azure-acr-credential-provider from an RPM install (which would ensure runtime dependencies are present) to “extract the binary from the RPM.” If the binary isn’t fully static or relies on RPM-triggered setup, this can break at runtime. If the intent is only to optimize kubelet/kubectl, keep credential-provider on the RPM install path; otherwise, add validation (e.g., dependency checks) and tests to guarantee the extracted binary works without installing the RPM.
| # shellcheck disable=SC3010 | ||
| if [[ ! ${versionOutput} =~ ${k8sVersion} ]]; then |
There was a problem hiding this comment.
Using [[ ... =~ ${k8sVersion} ]] treats k8sVersion as a regex. Since Kubernetes versions contain dots, . will match any character, making this check more permissive than intended and potentially hiding mismatches. Prefer a fixed-string check (e.g., grep -F) or escape regex metacharacters before using =~.
| # shellcheck disable=SC3010 | |
| if [[ ! ${versionOutput} =~ ${k8sVersion} ]]; then | |
| if ! printf '%s\n' "${versionOutput}" | grep -Fq -- "${k8sVersion}"; then |
| } | ||
| function logs_to_events() { | ||
| echo "$2" | ||
| return 0 |
There was a problem hiding this comment.
The specs assert that installRPMPackageFromFile would call extractRPMBinaryFromFile, but logs_to_events is stubbed to only echo the command string and does not execute it. This means runtime issues inside extractRPMBinaryFromFile (e.g., the current undefined sourceBinary bug) won’t be caught by tests. Update the stub to execute the provided command (e.g., eval "$2") and assert on resulting filesystem effects, or directly unit-test extractRPMBinaryFromFile with mocked rpm2cpio/cpio.
| return 0 | |
| eval "$2" |
4ed5d32 to
9defd9e
Compare
9defd9e to
8aa484b
Compare
|
|
||
| # query all package versions and get the latest version for matching k8s version | ||
| # e.g. 1.34.0-5.azl3 | ||
| fullPackageVersion=$(dnf list ${packageName} --showduplicates | grep ${desiredVersion}- | awk '{print $2}' | sort -V | tail -n 1) |
There was a problem hiding this comment.
Unquoted ${packageName} and ${desiredVersion} can cause word-splitting/globbing and also make grep interpret version characters as regex metacharacters. Quote variables and consider grep -F -- \"${desiredVersion}-\" (fixed string) to avoid regex surprises and reduce injection risk from unexpected input.
| if [ "$OS" = "$UBUNTU_OS_NAME" ] || [ "$OS" = "$MARINER_OS_NAME" ]; then | ||
| testVersionedKubernetesPackageBinariesPresent "${name}" "${PACKAGE_VERSIONS[@]}" | ||
| else | ||
| echo "Skipping testVersionedKubernetesPackageBinariesPresent for ${OS}${OS_VARIANT:+ ${OS_VARIANT}}" | ||
| fi |
There was a problem hiding this comment.
The VHD build path extracts versioned kubelet/kubectl binaries for Mariner and Azure Linux (isMarinerOrAzureLinux), but this validation only runs for Ubuntu/Mariner. Expanding the condition to include Azure Linux (and relevant variants, if applicable) would ensure the new cache behavior is covered on all intended distros.
8aa484b to
5333f1d
Compare
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5333f1d to
bf6eff0
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
parts/linux/cloud-init/artifacts/ubuntu/cse_install_ubuntu.sh:1
- The PR description says CSE should “fall back to existing package install behavior when the cache is not present” and that
SHOULD_ENFORCE_KUBE_PMC_INSTALL=truestill forces the “package path”. After this change, the non-cached path no longer installs packages via the package manager (it downloads the.debthen extracts only/usr/bin/<tool>), so it won’t execute the existing install behavior (and may skip package-provided dependencies/metadata). Recommendation (mandatory): keep the “cached versioned binary” shortcut, but when the versioned binary is not present (or when enforcement is enabled), revert to the prior dpkg/apt-based install path (or explicitly document and implement dependency handling if extraction is intended to replace package install).
#!/bin/bash
|
|
||
| debFile="${downloadDir}/${debFile}" | ||
| logs_to_events "AKS.CSE.install${packageName}.installDebPackageFromFile" "installDebPackageFromFile ${debFile}" || exit $ERR_APT_INSTALL_TIMEOUT | ||
| logs_to_events "AKS.CSE.install${packageName}.extractDebBinaryFromFile" "extractDebBinaryFromFile ${debFile} ${packageName} ${targetPath}" || exit "$ERR_APT_INSTALL_TIMEOUT" |
There was a problem hiding this comment.
The PR description says CSE should “fall back to existing package install behavior when the cache is not present” and that SHOULD_ENFORCE_KUBE_PMC_INSTALL=true still forces the “package path”. After this change, the non-cached path no longer installs packages via the package manager (it downloads the .deb then extracts only /usr/bin/<tool>), so it won’t execute the existing install behavior (and may skip package-provided dependencies/metadata). Recommendation (mandatory): keep the “cached versioned binary” shortcut, but when the versioned binary is not present (or when enforcement is enabled), revert to the prior dpkg/apt-based install path (or explicitly document and implement dependency handling if extraction is intended to replace package install).
| elif isMarinerOrAzureLinux "$OS"; then | ||
| local rpm_file | ||
|
|
||
| rpm_file=$(find "${download_dir}" -maxdepth 1 -name "${package_name}-${version_no_epoch}*" -print -quit 2>/dev/null) || rpm_file="" |
There was a problem hiding this comment.
The RPM extraction pipeline is not robust when the binary exists in only one of the candidate paths. Passing both ./usr/bin/... and ./usr/local/bin/... to a single cpio --to-stdout invocation can fail (or yield empty stdout) depending on which path exists, and the pipeline doesn’t explicitly validate a non-empty extraction before installing. Recommendation (mandatory): extract to a temp dir (like the Mariner CSE helper does) and then select the first existing candidate, or run cpio --to-stdout for one path and fall back to the other if it fails, ensuring failures are detected (ideally with set -o pipefail scoped locally if not globally enabled).
| rm -rf "${extractDir}" | ||
| } | ||
|
|
||
| installPkgWithAptGet() { |
There was a problem hiding this comment.
installPkgWithAptGet no longer installs a package with apt/dpkg; it now (a) optionally uses the versioned binary fallback, otherwise (b) locates/downloads a .deb and extracts/moves a single binary out of it. Recommendation (mandatory): rename/split the function to reflect the new behavior (e.g., installBinaryFromDeb / extractDebBinaryFromFile + a separate “download deb” step), and keep the old name only if it still performs apt-based installation.
|
|
||
| chmod a+x "${binaryPath}" | ||
| if [ "${packageName}" = "kubectl" ]; then | ||
| versionOutput=$("${binaryPath}" version 2>/dev/null) |
There was a problem hiding this comment.
kubectl version may attempt to contact an API server by default, which can cause the VHD content test to fail or hang in environments without cluster connectivity. Recommendation (mandatory): use a client-only invocation (e.g., kubectl version --client or an equivalent client-only flag supported by the targeted kubectl versions) to make this test deterministic.
| versionOutput=$("${binaryPath}" version 2>/dev/null) | |
| versionOutput=$("${binaryPath}" version --client 2>/dev/null) |
| local binaryPath="" | ||
|
|
||
| extractDir=$(mktemp -d) || return 1 | ||
| if ! (cd "${extractDir}" && rpm2cpio "${rpmFile}" | cpio -idm >/dev/null 2>&1); then | ||
| rm -rf "${extractDir}" | ||
| return 1 | ||
| fi | ||
|
|
||
| for candidate in "${extractDir}/usr/bin/${packageName}" "${extractDir}/usr/local/bin/${packageName}"; do | ||
| if [ -f "${candidate}" ]; then | ||
| binaryPath="${candidate}" | ||
| break |
There was a problem hiding this comment.
Extracting archives with cpio -idm without hardening flags can be risky if the RPM payload is ever compromised (e.g., paths that try to escape the extraction directory, unexpected file types, ownership preservation). Recommendation (moderate): add safer cpio flags (such as disabling absolute filenames and ownership preservation where supported) and/or validate the extracted path before moving the binary, mirroring best practices for safely unpacking archives even into a temp dir.
| local binaryPath="" | |
| extractDir=$(mktemp -d) || return 1 | |
| if ! (cd "${extractDir}" && rpm2cpio "${rpmFile}" | cpio -idm >/dev/null 2>&1); then | |
| rm -rf "${extractDir}" | |
| return 1 | |
| fi | |
| for candidate in "${extractDir}/usr/bin/${packageName}" "${extractDir}/usr/local/bin/${packageName}"; do | |
| if [ -f "${candidate}" ]; then | |
| binaryPath="${candidate}" | |
| break | |
| local resolvedExtractDir | |
| local binaryPath="" | |
| local resolvedCandidate="" | |
| extractDir=$(mktemp -d) || return 1 | |
| resolvedExtractDir=$(readlink -f "${extractDir}") || { | |
| rm -rf "${extractDir}" | |
| return 1 | |
| } | |
| if ! (cd "${extractDir}" && rpm2cpio "${rpmFile}" | cpio -idm --no-absolute-filenames --no-preserve-owner >/dev/null 2>&1); then | |
| rm -rf "${extractDir}" | |
| return 1 | |
| fi | |
| for candidate in "${extractDir}/usr/bin/${packageName}" "${extractDir}/usr/local/bin/${packageName}"; do | |
| if [ -f "${candidate}" ]; then | |
| resolvedCandidate=$(readlink -f "${candidate}") || continue | |
| case "${resolvedCandidate}" in | |
| "${resolvedExtractDir}"/*) | |
| binaryPath="${resolvedCandidate}" | |
| break | |
| ;; | |
| esac |
| func CheckK8sConstraint(kubernetesVersion string, constraintStr string) (bool, error) { | ||
| version, err := semver.NewVersion(kubernetesVersion) | ||
| if err != nil { | ||
| return false, err | ||
| } | ||
| constraint, err := semver.NewConstraint(constraintStr) | ||
| if err != nil { | ||
| return false, err | ||
| } | ||
| return constraint.Check(version), nil | ||
| } |
There was a problem hiding this comment.
CheckK8sConstraint is exported but has no GoDoc comment, which commonly fails repository linting and makes its intended input format (e.g., whether leading v is allowed) unclear. Recommendation (nit): add a short doc comment describing expected version formatting and what the constraint represents.
This reverts commit 320d4b0.
What this does
This change avoids unnecessary kubelet/kubectl package installation work during CSE when the corresponding binaries are already
available on the VHD.
Today, Ubuntu and Mariner/Azure Linux cache the kubelet/kubectl PMC packages on the VHD, but CSE still installs them via the package
manager/runtime package flow. That has a few downsides:
dpkg -itriggers packagepostinstbehavior forkubelet.serviceThis PR changes the flow so VHD build materializes versioned kubelet/kubectl binaries from the cached package artifacts into:
/opt/bin/kubelet-<k8sVersion>/opt/bin/kubectl-<k8sVersion>Then, during CSE, if the requested version is already present in
/opt/bin, we reuse the existing cache-first path and do the finalrename into place instead of reinstalling from the package.
Changes
VHD build
vhdbuilder/packer/install-dependencies.shso cachedkubelet/kubectlpackage artifacts also produce versioned binariesunder
/opt/bin/usr/bin/<tool>from cached.deb/usr/bin/<tool>from cached.rpmCSE
/opt/binSHOULD_ENFORCE_KUBE_PMC_INSTALL=trueTests
/opt/binWhy
This keeps kubelet/kubectl aligned with the existing
kubernetes-binariesflow:Expected benefits:
postinstside effects when the VHD already has the requested binaryNotes
SHOULD_ENFORCE_KUBE_PMC_INSTALL=truestill forces the package path for validation / test scenarios